ABSTRACT: Maximum likelihood theory has been applied to the
analysis of scattered sensitivity data. The analysis can also be
used for collected data. The logistic distribution is assumed.
The calculation of percent points with their confidence limits
is illustrated. A program for the IBM 7090 computer is included.

Maximum Likelihood Logistic Analysis of Scattered
Go/No-Go (Quantal) Data

This report gives the results of work done to adapt existing
statistical techniques in sensitivity experiments to the case
in which the logistic, rather than the normal, distribution is
assumed. The use of the logistic distribution gives a somewhat
better fit to sensitivity data, and also more conservative
estimates of reliability and safety, and is therefore
considered preferable to the use of the normal distribution. The
work was carried out under Task NOL 4'113/NWL. The method of
analysis is applicable to any type of quantal data. It is
particularly valuable when the stimulus cannot be controlled
precisely but can be measured accurately. It should be of
interest to those working with ordnance, explosives, missiles,
airframes, and space vehicles. It might also be of interest to
agricultural and biological disciplines dealing with the response
of living organisms to toxic environments, particularly where
the actual intake of toxic material by each individual can be
measured, as in studies of the lethality of radiation dosage or
heavy-metal poisoning.

DARE 

Captain, USN

INTRODUCTION

A situation frequently arising in experimental work is that
of go/no-go testing associated with a continuous variable which
cannot be measured as such in practice. An example of this is
the determination of the sensitivity of an explosive to shock.

The shock to which an explosive is subjected is a continuous
variable. It can be assumed that there is a critical value of
the shock for each test specimen such that the explosive would
respond to shocks greater than this value and fail to respond to
lesser shocks. Therefore, in practice all that can be determined
is that some known shock is greater or less than the critical
value; i.e., that the explosive did or did not explode. How
close the explosive came to firing or failing is not detected.


The treatment of such data when the stimulus can be
assigned predetermined values has been discussed by C. I. Bliss
and the Statistical Research Group of Princeton University,
among others. These writers have assumed that the data follow
a normal frequency distribution. Joseph Berkson has considered
the same problem assuming the logistic distribution.

Golub and Grubbs have analyzed the treatment of data of
this kind, considering the possibility that the stimulus cannot
be precisely determined in advance but can be measured accurately.
In this case the experiment usually consists of a set of trials,
each with a different stimulus, for each of which a response or
non-response is noted. As an example, Golub and Grubbs described
an experiment to determine the velocity at which an armor-piercing
projectile will penetrate a given armor plate. Five trials were
made, two of which resulted in penetrations. The range of
velocities for which penetrations were observed overlapped the
range for which non-penetrations were observed. This zone of
mixed response is essential in the analysis. Using these data,
they obtained an estimate of the mean and standard deviation of
the velocity required for penetration, assuming a normal
distribution. The purpose of this report is to give a similar
method of analysis when the logistic distribution is assumed.
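The requirement of a zone of mixed response can be checked before any fitting is attempted. The following is a minimal sketch, not part of the original program: the function name `has_mixed_zone` is hypothetical, and the velocities are those tabulated in the numerical example later in this report.

```python
def has_mixed_zone(successes, failures):
    """Return True when the lowest responding stimulus lies below
    the highest non-responding stimulus, i.e. the two ranges overlap."""
    return min(successes) < max(failures)


# Velocities from the Golub and Grubbs example tabulated later in this report.
penetrations = [2423.0, 2453.0]
non_penetrations = [2415.0, 2415.0, 2433.0]

print(has_mixed_zone(penetrations, non_penetrations))
```

For these data the lowest penetration (2423) lies below the highest non-penetration (2433), so the ranges overlap.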

STATISTICAL  MODEL 

For the logistic distribution,

$$ t = \frac{x - \mu}{\gamma} = Bx + A \tag{1} $$

In equation (1), x is the independent variable (stimulus), and
μ and γ are parameters of the logistic distribution. The
parameter μ has the same meaning as it has in the normal
distribution, being a measure of the location of the center of the
distribution. The parameter γ is similar to but not the same
as σ, the standard deviation of the normal distribution. It
is a measure of the dispersion of the population. When the
cumulative function is plotted in the logistic probability
space, γ is the reciprocal of the slope.

In discussing properties of distribution functions, it is
usually convenient to transform the independent variable, x,
to a standardized variable. The letter t is often used to
denote this variable.

In terms of this standardized variable the distribution
will have a mean of zero and its dispersion parameter (γ in this
case) will be unity. The first equality of (1) is the equation
which makes this transformation. The second equality expresses
the distributional relationship in the form of a simple linear
equation where A and B are constants.


It should be noted that a stimulus one γ above μ (t = 1) in the
logistic distribution corresponds to about 73% response, rather
than the 84% that corresponds to one σ above the mean in the
normal distribution. The value of γ in equation (1) is therefore
somewhat less than two-thirds of the value of σ in the normal
distribution. The expected probability, p, can be expressed in
terms of t by the relation

$$ p = \frac{1}{1 + e^{-t}} = \frac{e^{t}}{1 + e^{t}} = 1 - q \tag{2} $$

$$ q = \frac{1}{1 + e^{t}} $$

These values of p and q are the expected probabilities of a
success or failure for that value of t for the assumed
distribution.
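Equations (1) and (2) can be sketched in a few lines. This is an illustrative sketch only; the function names `standardize` and `response_probabilities` are hypothetical and do not come from the program in Appendix A.

```python
import math


def standardize(x, mu, gamma):
    """First equality of equation (1): t = (x - mu) / gamma."""
    return (x - mu) / gamma


def response_probabilities(t):
    """Equation (2): expected probabilities of success (p) and failure (q)."""
    p = 1.0 / (1.0 + math.exp(-t))
    q = 1.0 / (1.0 + math.exp(t))
    return p, q


# One gamma above mu (t = 1) gives roughly 73% response.
p_one, q_one = response_probabilities(1.0)

# First trial of the numerical example: v = 2415 with mu = 2435, gamma = 10.5.
p_2415, q_2415 = response_probabilities(standardize(2415.0, 2435.0, 10.5))
print(round(p_one, 4), round(p_2415, 4))
```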


MAXIMIZING  THE  LIKELIHOOD  FUNCTION 


The likelihood function, P, is the probability that the
complete set of responses as observed will occur. Since these
events are assumed to be independent, the probability of observing
the set will be the product of the probabilities of the separate
observations. P can therefore be written as

$$ P = \prod_{i=1}^{n} p_i \prod_{i=1}^{m} q_i \tag{3} $$

where

$\prod_{i=1}^{n} p_i$ indicates the product of the probabilities
p_1, p_2, p_3, ..., p_n
n = number of successful responses
m = number of unsuccessful responses.

Rather than maximize P it is more convenient to maximize
its logarithm, L. This can be written as

$$ L = \sum_{i=1}^{n} \ln p_i + \sum_{i=1}^{m} \ln q_i \tag{4} $$

Here

$$ \sum_{i=1}^{n} \ln p_i = \ln p_1 + \ln p_2 + \cdots + \ln p_n $$
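Equations (3) and (4) can be compared numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the data are the five trials of the numerical example later in this report, and the two parameter pairs are the first estimates and the final estimates quoted there.

```python
import math


def likelihood(mu, gamma, successes, failures):
    """Equation (3): product of p over successes and q over failures."""
    value = 1.0
    for x in successes:
        value *= 1.0 / (1.0 + math.exp(-(x - mu) / gamma))
    for x in failures:
        value *= 1.0 / (1.0 + math.exp((x - mu) / gamma))
    return value


def log_likelihood(mu, gamma, successes, failures):
    """Equation (4): ln p = -ln(1 + e^-t) and ln q = -ln(1 + e^t)."""
    total = 0.0
    for x in successes:
        total -= math.log1p(math.exp(-(x - mu) / gamma))
    for x in failures:
        total -= math.log1p(math.exp((x - mu) / gamma))
    return total


SUCCESSES = [2423.0, 2453.0]
FAILURES = [2415.0, 2415.0, 2433.0]

L_start = log_likelihood(2435.0, 10.5, SUCCESSES, FAILURES)
L_final = log_likelihood(2431.93, 9.52, SUCCESSES, FAILURES)
print(L_start, L_final)
```

Because the logarithm is monotonic, the parameter values that maximize L also maximize P, and the log-likelihood at the final estimates is higher than at the first estimates.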


In order to maximize L we find its partial derivatives with
respect to μ and γ and equate these to zero. These partial
derivatives can be found easily by substituting the values of
p and q in terms of t as given in equation (2). Then, since
∂t/∂μ = -1/γ and ∂t/∂γ = -t/γ,

$$ \frac{\partial L}{\partial \mu} = \frac{1}{\gamma} \left[ \sum_{i=1}^{m} p_i - \sum_{i=1}^{n} q_i \right] = 0 \tag{5} $$

$$ \frac{\partial L}{\partial \gamma} = \frac{1}{\gamma} \left[ \sum_{i=1}^{m} p_i t_i - \sum_{i=1}^{n} q_i t_i \right] = 0 \tag{6} $$

SOLUTION FOR μ AND γ


The Newton-Raphson iteration procedure may be used to
solve these equations for μ and γ, provided first estimates
μ₀ and γ₀ can be found which are sufficiently close to the true
values. This procedure uses the two equations

$$ \frac{\partial^2 L}{\partial \mu^2} \Delta\mu + \frac{\partial^2 L}{\partial \mu \, \partial \gamma} \Delta\gamma = -\frac{\partial L}{\partial \mu} \tag{7} $$

$$ \frac{\partial^2 L}{\partial \mu \, \partial \gamma} \Delta\mu + \frac{\partial^2 L}{\partial \gamma^2} \Delta\gamma = -\frac{\partial L}{\partial \gamma} \tag{8} $$

with all derivatives evaluated at μ₀ and γ₀, to obtain new
estimates of μ and γ by adding Δμ and Δγ to the previous
estimates:

$$ \mu_1 = \mu_0 + \Delta\mu $$

$$ \gamma_1 = \gamma_0 + \Delta\gamma $$
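Equations (7) and (8) are a pair of linear equations in the two corrections, so a single Newton-Raphson step reduces to a 2 x 2 solve. The sketch below uses Cramer's rule; the function name `newton_corrections` and its argument names are hypothetical. The check uses the scaled coefficients quoted in the numerical example later in this report, where both equations have been multiplied through by the square of the first estimate of γ.

```python
def newton_corrections(h_mm, h_mg, h_gg, g_m, g_g):
    """Solve equations (7) and (8) for (delta_mu, delta_gamma).

    h_mm, h_mg, h_gg : second partials of L (mu-mu, mu-gamma, gamma-gamma)
    g_m, g_g         : first partials of L with respect to mu and gamma
    """
    det = h_mm * h_gg - h_mg * h_mg
    if det == 0.0:
        raise ZeroDivisionError("singular system; no unique correction")
    d_mu = (-g_m * h_gg + h_mg * g_g) / det
    d_gamma = (-h_mm * g_g + h_mg * g_m) / det
    return d_mu, d_gamma


# Scaled system from the numerical example:
#   -0.7705 d_mu + 0.6743 d_gamma =  2.3898
#    0.6743 d_mu - 1.5424 d_gamma = -0.6069
# The right-hand sides are the negatives of the (scaled) first partials.
d_mu, d_gamma = newton_corrections(-0.7705, 0.6743, -1.5424, -2.3898, 0.6069)
print(round(d_mu, 2), round(d_gamma, 2))
```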

The expressions for the second partial derivatives required in
equations (7) and (8) are

$$ \frac{\partial^2 L}{\partial \mu^2} = -\frac{1}{\gamma^2} \sum_{i=1}^{m+n} p_i q_i \tag{9a} $$

$$ \frac{\partial^2 L}{\partial \mu \, \partial \gamma} = -\frac{1}{\gamma^2} \left[ \sum_{i=1}^{m} p_i - \sum_{i=1}^{n} q_i + \sum_{i=1}^{m+n} p_i q_i t_i \right] \tag{9b} $$

$$ \frac{\partial^2 L}{\partial \gamma^2} = \frac{1}{\gamma^2} \left[ -\sum_{i=1}^{m+n} p_i q_i t_i^2 + 2 \sum_{i=1}^{n} t_i - 2 \sum_{i=1}^{m+n} p_i t_i \right] \tag{9c} $$

Here sums to n run over the successes, sums to m over the
failures, and sums to m + n over all trials.

We start with reasonably good estimates of μ and γ which can be
used as μ₀ and γ₀ in equations (7) and (8) to find new estimates
μ₁ and γ₁. This process is repeated until the corrections Δμ
and Δγ become acceptably small. The process will diverge if
the original estimates are not sufficiently good. The estimate
of γ is the most critical: it must not be too large. Even with
a perfect estimate of the mean the process will diverge if the
estimate of γ is twenty-five per cent high. In this
connection it should be remembered that, as pointed out above,
the γ of the logistic distribution is smaller than the σ of the
normal distribution. A good rule to follow would be to estimate
the fifty per cent point as closely as possible along with a
good guess of the sixty-five or seventy per cent point. The
difference of these points could be used as the initial estimate
γ₀. In case of doubt it is better to take γ₀ small rather
than large. If γ₀ is taken so large that the process does diverge,
a much smaller value should be chosen and the process begun again.

NUMERICAL EXAMPLE

For a numerical example we take the data used by Golub and
Grubbs. As a first estimate we use μ₀ = 2435 and γ₀ = 10.5.
The data and values of v, t, t², p, q, and pq are tabulated here.


Expected Values Assuming μ₀ = 2435, γ₀ = 10.5

                  v        t       t²       p        q        pq
(m) failure     2415   -1.905    3.629   0.1296   0.8704   0.1128
    failure     2415   -1.905    3.629   0.1296   0.8704   0.1128
    failure     2433   -0.190    0.036   0.4527   0.5473   0.2478
(n) success     2423   -1.143    1.306   0.2132   0.7868   0.1677
    success     2453    1.714    2.938   0.8473   0.1527   0.1294
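The required partial derivatives are

$$ \frac{\partial L}{\partial \mu} = \frac{1}{\gamma_0} (-0.2276) $$

$$ \frac{\partial L}{\partial \gamma} = \frac{1}{\gamma_0} (0.0578) $$

$$ \frac{\partial^2 L}{\partial \mu^2} = \frac{1}{\gamma_0^2} (-0.7705) $$

$$ \frac{\partial^2 L}{\partial \mu \, \partial \gamma} = \frac{1}{\gamma_0^2} (0.6743) $$

$$ \frac{\partial^2 L}{\partial \gamma^2} = \frac{1}{\gamma_0^2} (-1.5424) $$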

Substitution in equations (7) and (8), and multiplication by
γ₀², gives

-0.7705 Δμ + 0.6743 Δγ =  2.3898

 0.6743 Δμ - 1.5424 Δγ = -0.6069

Solving these we get Δμ = -4.47 and Δγ = -1.56 so that our
new estimates become

μ₁ = 2435.0 - 4.47 = 2430.53 and

γ₁ = 10.5 - 1.56 = 8.94.

The computations are then repeated using μ₁ for μ₀ and γ₁ for
γ₀. This iterative process is continued until the corrections
become small enough to be considered negligible. For this
example the fourth iteration gives μ₄ = 2431.93 and γ₄ = 9.52
with satisfactorily small corrections.
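The whole iteration can be sketched compactly. The code below is an illustration, not the FORTRAN II program of Appendix A: the function name `fit_logistic`, the tolerance, and the iteration cap are assumptions made here. It evaluates the brackets of equations (5) and (6) and the second partials (9a) to (9c), each multiplied by the square of the current γ, and solves equations (7) and (8) at every pass.

```python
import math


def fit_logistic(successes, failures, mu, gamma, tol=1e-6, max_iter=50):
    """Newton-Raphson on (mu, gamma); returns (mu, gamma, iterations used)."""
    trials = [(x, True) for x in successes] + [(x, False) for x in failures]
    for iteration in range(1, max_iter + 1):
        g = h = s_pq = s_pqt = s_pqt2 = s_pt = s_t_succ = 0.0
        for x, responded in trials:
            t = (x - mu) / gamma
            p = 1.0 / (1.0 + math.exp(-t))
            q = 1.0 - p
            if responded:
                g -= q            # bracket of equation (5)
                h -= q * t        # bracket of equation (6)
                s_t_succ += t
            else:
                g += p
                h += p * t
            s_pq += p * q
            s_pqt += p * q * t
            s_pqt2 += p * q * t * t
            s_pt += p * t
        # gamma**2 times the second partials (9a), (9b), (9c)
        a = -s_pq
        b = -(g + s_pqt)
        c = -s_pqt2 + 2.0 * s_t_succ - 2.0 * s_pt
        # gamma**2 times the right-hand sides of equations (7) and (8)
        r1, r2 = -gamma * g, -gamma * h
        det = a * c - b * b
        d_mu = (r1 * c - b * r2) / det
        d_gamma = (a * r2 - b * r1) / det
        mu, gamma = mu + d_mu, gamma + d_gamma
        if gamma <= 0.0:
            raise ValueError("gamma became non-positive; retry with a smaller first estimate")
        if abs(d_mu) < tol and abs(d_gamma) < tol:
            return mu, gamma, iteration
    raise RuntimeError("no convergence; retry with a smaller first estimate of gamma")


mu_hat, gamma_hat, n_iter = fit_logistic(
    [2423.0, 2453.0], [2415.0, 2415.0, 2433.0], mu=2435.0, gamma=10.5)
print(round(mu_hat, 2), round(gamma_hat, 2), n_iter)
```

Because this sketch evaluates p from equation (2) at every trial, its first corrections differ somewhat from the -4.47 and -1.56 quoted above, but the iteration settles close to the final estimates reported in the text. The error messages follow the advice given earlier: if the process fails, begin again with a smaller γ₀.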


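STANDARD ERRORS OF μ AND γ

Confidence limits can be assigned to these estimates by
finding their standard errors. Even though we have assumed the
logistic distribution for the data, the estimates of μ and γ
will have a distribution which is asymptotically normal.

Their standard errors can be calculated by evaluating the
variance-covariance matrix, which can be obtained as the inverse
of the matrix of the negatives of the expected values of the
second partial derivatives:

$$ \begin{pmatrix} s_\mu^2 & s_{\mu\gamma} \\ s_{\mu\gamma} & s_\gamma^2 \end{pmatrix} = \begin{pmatrix} -E\left(\dfrac{\partial^2 L}{\partial \mu^2}\right) & -E\left(\dfrac{\partial^2 L}{\partial \mu \, \partial \gamma}\right) \\ -E\left(\dfrac{\partial^2 L}{\partial \mu \, \partial \gamma}\right) & -E\left(\dfrac{\partial^2 L}{\partial \gamma^2}\right) \end{pmatrix}^{-1} $$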

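In the numerical example the expected values of the second
partial derivatives for the last iteration are

$$ E\left(\frac{\partial^2 L}{\partial \mu^2}\right) = -0.0087997, \quad E\left(\frac{\partial^2 L}{\partial \mu \, \partial \gamma}\right) = 0.0045179, \quad E\left(\frac{\partial^2 L}{\partial \gamma^2}\right) = -0.015695 $$

This gives

$$ \begin{pmatrix} 0.0087997 & -0.0045179 \\ -0.0045179 & 0.015695 \end{pmatrix}^{-1} = \begin{pmatrix} 133.35 & 38.38 \\ 38.38 & 74.76 \end{pmatrix} $$

so that

$$ s_\mu^2 = 133.35, \quad s_\gamma^2 = 74.76, \quad s_\mu = 11.55, \quad s_\gamma = 8.64 $$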
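The matrix inversion above can be checked with a closed-form 2 x 2 inverse. This is a small sketch using only the figures quoted in the text; the helper name `invert_2x2` is hypothetical.

```python
import math


def invert_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]], returned as (a', b', c', d')."""
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("matrix is singular")
    return d / det, -b / det, -c / det, a / det


# Negatives of the expected second partial derivatives quoted in the text.
information = (0.0087997, -0.0045179, -0.0045179, 0.015695)

var_mu, cov_mu_gamma, _, var_gamma = invert_2x2(*information)
se_mu = math.sqrt(var_mu)
se_gamma = math.sqrt(var_gamma)
print(round(var_mu, 2), round(cov_mu_gamma, 2), round(var_gamma, 2))
```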

PREDICTION  OF  PER  CENT  POINTS  AND 
THEIR  STANDARD  ERRORS 


In order to predict per cent points and to assign confidence
limits to these points, we can proceed as follows. The expected
value of any per cent point X_P, where P is the probability
expressed in per cent, is given by

$$ X_P = \mu + c\gamma \quad \text{where} \quad c = \ln \frac{P}{100 - P} $$

The standard deviation of this estimate is given by

$$ s_P = \sqrt{s_\mu^2 + c^2 s_\gamma^2} $$

The confidence limits on the estimate of X_P will be obtained
by adding to or subtracting from X_P the quantity k s_P, where k
is the standardized variable in the normal distribution associated
with the desired confidence. In our numerical example we find
the ninety-nine per cent point as follows. We find that
c = 4.5951 so that

$$ X_{99} = 2431.93 + (4.5951)(9.52) = 2475.68 $$

$$ s_{99} = \sqrt{133.35 + (21.1149)(74.76)} = 41.375 $$

The upper one-sided 95% confidence limit on X_99 is

$$ X_{99} + 1.645 \, s_{99} = 2543.74 $$
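The arithmetic for the ninety-nine per cent point can be reproduced as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the standard deviation combines only the two variance terms, which is what the figures quoted in the text correspond to (a fuller propagation of error would also carry a term in the covariance 38.38, which is not attempted here).

```python
import math


def per_cent_point(P, mu, gamma):
    """Return (X_P, c) with c = ln(P / (100 - P)) and X_P = mu + c * gamma."""
    c = math.log(P / (100.0 - P))
    return mu + c * gamma, c


def per_cent_point_sd(c, var_mu, var_gamma):
    """Standard deviation of X_P from the two variance terms only."""
    return math.sqrt(var_mu + c * c * var_gamma)


x99, c99 = per_cent_point(99.0, mu=2431.93, gamma=9.52)
s99 = per_cent_point_sd(c99, var_mu=133.35, var_gamma=74.76)
upper_95 = x99 + 1.645 * s99   # upper one-sided 95% limit
print(round(x99, 2), round(s99, 3), round(upper_95, 2))
```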

To compare our results with those obtained by Golub and
Grubbs with the normal distribution, we have tabulated the
estimates for several per cent points as predicted by both
calculations, together with the upper 95% confidence limits as
computed above.

                            Logistic
Per Cent     Normal     Expected   Upper Limit
   75        2441.7      2442.4      2467.0
   90        2450.8      2452.8      2489.4
   95        2456.3      2460.0      2506.0
   99        2466.5      2475.7      2543.7

These  results  show  the  longer  tails  associated  with  the  logistic 
distribution  as  compared  with  the  normal. 
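The longer tails can also be seen in standardized form. The sketch below is an illustration under an assumption that is not part of the comparison above: both distributions are given mean 0 and standard deviation 1, using the fact that a logistic distribution with dispersion parameter γ has standard deviation γπ/√3. The fitted curves in the table were obtained separately, so their values are not expected to match these standardized ones.

```python
import math
from statistics import NormalDist


def logistic_quantile(prob, mu=0.0, gamma=1.0):
    """Stimulus at which the logistic cumulative function equals prob."""
    return mu + gamma * math.log(prob / (1.0 - prob))


# Assumption: unit standard deviation for both curves.
GAMMA_UNIT_SD = math.sqrt(3.0) / math.pi   # about 0.551

comparison = {
    pct: (NormalDist().inv_cdf(pct / 100.0),
          logistic_quantile(pct / 100.0, gamma=GAMMA_UNIT_SD))
    for pct in (75, 90, 95, 99)
}
for pct, (z_normal, z_logistic) in comparison.items():
    print(pct, round(z_normal, 3), round(z_logistic, 3))
```

With equal standard deviations the logistic per cent point lies inside the normal one near the center and outside it far in the tail, which is the sense in which its tails are longer.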

SUMMARY  AND  COMPARISON  WITH  BERKSON'S  METHOD 

This method makes it possible to obtain an estimate of the
stimulus necessary to produce a desired response assuming a
logistic distribution for the data. It is also possible to
assign confidence limits to this estimate. A FORTRAN II program
for carrying out the required computations on the IBM 7090
computer has been written and has been in use at the Naval
Ordnance Laboratory. This program is given as Appendix A of
this report.

Berkson has used the maximum likelihood theory to evaluate
the constants A and B in equation (1). Here A = -μ/γ and
B = 1/γ. It may be of interest to note that Berkson's
method has a different region of convergence than the method
described in this report. In Berkson's method γ can be large
and should not be too small. As examples to illustrate this
point we can use the Golub-Grubbs data and let x = v - 2423.
Then the fifty per cent point as computed above will be 8.93.
If we start with estimates μ₀ = 2 and γ₀ = 5 the process
described in this report will converge. For the corresponding
values A = -0.4 and B = 0.2 Berkson's method will diverge. On the
other hand, if μ₀ = 5 and γ₀ = 20 the method of this report will
diverge, whereas for the corresponding values of A = -0.25 and
B = 0.05 Berkson's process converges. Berkson does not give
estimates of the
variances of A and B. We have found, by using the variance-
covariance matrix, that the asymptotic variance of A' is given
by 1/Σw, and of B' by 1/Σw(x - x̄)², where w = pq, when
the equation is written in the form t = B'(x - x̄) + A'.
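The correspondence between the two parameterizations used in the convergence comparison can be sketched as follows. The function names are hypothetical; the relations A = -μ/γ and B = 1/γ follow from the two equalities of equation (1).

```python
import math


def to_berkson(mu, gamma):
    """Convert (mu, gamma) to the constants (A, B) of t = Bx + A."""
    return -mu / gamma, 1.0 / gamma


def from_berkson(A, B):
    """Convert (A, B) back to (mu, gamma)."""
    return -A / B, 1.0 / B


# Starting values quoted in the comparison, with x = v - 2423.
A1, B1 = to_berkson(2.0, 5.0)
A2, B2 = to_berkson(5.0, 20.0)
print(A1, B1, A2, B2)
```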